Setup

Install the package for the first time. devtools::install_github("ccb-hms/phonto")

library(phonto)
library(DT)
library(nhanesA)


sqlHost <- "localhost"
sqlUserName <- "sa"
sqlPassword <- "yourStrong(!)Password"
sqlDefaultDb <- "NhanesLandingZone"

Step 1: search tables

It can be search by certain variable.

1) The function of search table names only by veriable has been replicated

# nameonly <- nhanesSearchVarName('BPXPULS')
# nameonly

# replicate function
nameonly1 <- searchTablesByVar('BPXPULS')
nameonly1
#>    Questionnaire     TableName
#> 1            BPX BloodPressure
#> 2          BPX_B BloodPressure
#> 3          BPX_C BloodPressure
#> 4          BPX_D BloodPressure
#> 5          BPX_E BloodPressure
#> 6          BPX_F BloodPressure
#> 7          BPX_G BloodPressure
#> 8          BPX_H BloodPressure
#> 9          BPX_I BloodPressure
#> 10         BPX_J BloodPressure

2) We need the meta data in the databse to replicate the details if we need them.

# details <- nhanesSearchVarName('BPXPULS', includerdc=TRUE, ystart=2001, ystop=2008, nchar=20,namesonly=F)
# datatable(details)
# But we can do something like this now:
searchTableByName("BPX",ystart=2001, ystop=2008)
#>   Questionnaire     TableName
#> 1         BPX_B BloodPressure
#> 2         BPX_C BloodPressure
#> 3         BPX_D BloodPressure
#> 4         BPX_E BloodPressure

3) Joint query

Replicated the name only version and need meta data in database to replicate the details

cols = c("RIDAGEYR","RIAGENDR","BMXBMI","DMDEDUC2")
data = jointQuery(c('BodyMeasures','DemographicVariablesAndSampleWeights'),cols)
DT::datatable(data)

3) Union Query

tablnames = searchTableByName('BPX[_]')
tablnames
#>   Questionnaire     TableName
#> 1         BPX_B BloodPressure
#> 2         BPX_C BloodPressure
#> 3         BPX_D BloodPressure
#> 4         BPX_E BloodPressure
#> 5         BPX_F BloodPressure
#> 6         BPX_G BloodPressure
#> 7         BPX_H BloodPressure
#> 8         BPX_I BloodPressure
#> 9         BPX_J BloodPressure
blood_df <- unionQuery(tablnames$TableName,cols =c("BPXDI1","BPXDI2","BPXSY1","BPXSY2"))
DT::datatable(blood_df)

3) query by variables

We need to ensure all the variables can be found across the same table group; otherwise, the function will throw an error.

searchTablesByVar("URXDMA")
#>   Questionnaire                                                   TableName
#> 1      L06PHY_C                                         PhytoestrogensUrine
#> 2        PHPYPA PhthalatesPhytoestrogensAndPAHsUrinePHPYPAUrinaryPhthalates
#> 3      PHPYPA_B                        PhthalatesPhytoestrogensAndPAHsUrine
#> 4       PHYTO_D                                         PhytoestrogensUrine
#> 5       PHYTO_E                                         PhytoestrogensUrine
#> 6       PHYTO_F                                         PhytoestrogensUrine
phy_urine = c("URXDAZ","URXDMA","URXEQU", "URXETD","URXETL","URXGNS")

# another new function that allow users query data by a group variables
phy_urine_df = queryByVars(phy_urine)
DT::datatable(phy_urine_df)

Step 2: check tables

1) A quick way to do it is to show the table names’ details, and we can find the table named “P_BPXO” and “BPXO_J” are different from the rest of the tables that present blood pressures.

tb_detail <- nhanesSearchTableNames('BPX', includerdc=TRUE, nchar=42, details=TRUE)
datatable(tb_detail)

2) We can also check tables by showing the variables and descriptions in the tables.

For example,

var_detail <- variableDescr("BPX_D")
datatable(var_detail)

To impalement the rest of functions:

We need the docker database to add a table that mapping short names(eg. BPX_C) to the long descriptive name.

Step 3: check tables:

1) Given the information from the above two steps, we can query data by specifying tables and columns.

tables <- c("BPX_D","BPX_E","BPX","BPX_C")
cols <- c("BPXDI1","BPXDI2","BPXSY1","BPXSY2")
# bpx_data <- queryData(tables,cols)
# datatable(bpx_data)

Check the data availability over the years

In the following matrix, 0 means the tables have no data in that year, and 1 means they have data in corresponding years.


# mtx <- check_data()
# DT::datatable(
#   mtx
# ) |> DT::formatStyle(colnames(mtx),
#                  backgroundColor = styleInterval(c(0, 1), c('gray', 'red',"white")),
#                  fontWeight = 'bold')